In this practical, you will make a GitHub repository, create an R markdown document, and track changes in this document using Git. You can do this using a desktop interface for Git, or the command line; instructions for both are given.

What is a GitHub repository?

A repository is essentially the same as a directory: a folder containing your files for one project. It is not limited to code; it can also store images, text files, etc. The idea is that, a bit like Dropbox, there will be a copy of this folder on the GitHub website, which can be copied to multiple different computers. The big plus is that Git was designed for version control, keeping “snapshots” of each point in your repository’s history without you having to keep 20 different versions of each file.

As a member of the jknightlab GitHub group, you can have repositories under your account that only you can see, as well as repositories associated with the lab account that everyone can access. For example, I have personal repositories for my thesis and for odds and ends that don’t go anywhere else, and in the jknightlab group I use the “GAiNS” and “CardiacSurgery” repositories. All of these are synced to my work computer and my laptop, and other people on the GAinS project can also have copies. These can also be public or private repositories, controlling whether people outside the group can see them.

Creating a GitHub repository

  1. Open a web browser and go to the GitHub website

  2. Sign in to reach your GitHub home page

  3. Click on the green “New repository” button under “Your repositories”

  4. Fill in the details: if you want you can make a repository that you will continue to use, or just call it “Git_Practice” and delete it at the end. Select your username as the owner unless you are making a repository for a lab project not already included in the lab account.

  5. Tick “Initialize this repository with a README” and click “Create repository”

  6. You will now be taken to the home page for your new repository. This shows you all the files in your folder (currently just a README), and allows you to edit them directly. Have a look around; there are a LOT of options and you can do a lot of things that are beyond the scope of this introduction.

Editing files on GitHub

  1. Click on “README.md” in the file list. This will open the README file (the page will look pretty similar as it is the only file in your repository at the moment)

  2. To edit the file, click the pencil symbol at the top right.

  3. You now have this file open in a web text editor. It is a markdown file, so the formatting is controlled using markdown tags (e.g. # for a header, ** for bold text). Markdown (technically GitHub Flavored Markdown, which has a few minor differences) is used across GitHub for formatting text, which means that markdown files are displayed as much nicer looking web pages. Make some changes (e.g. see below), and use the “Preview changes” tab to see the output rendered as a web page. Anything you have added to the file will have a green bar next to it, and anything you have deleted a red bar.
# GitHub_and_Rmd_Introduction

This is my practice markdown document.

Use a blank line to separate paragraphs

## A second level header

A list:

- item one
- item two

and a numbered list

1. you
2. get
3. the point

### A third level header

**Some bold text** and _some italic text_

And a [link](http://kbroman.org/knitr_knutshell/pages/markdown.html)
  4. Commit your changes. This is a fancy (quicker) way of saying “save the file, but remember what has been changed from the previous version”. Give the commit a short, descriptive name so that if you ever want to find or revert to a previous version of your file, you can find it easily. You can also add a longer, optional description to keep track of what you are doing.

  5. If you navigate back to the repository home by clicking on its name at the top of the page, you will see that your commit is now logged and the README displayed has been updated. Click on the clock with “2 commits” next to it underneath the repository description. This shows you the history of your changes. Click on the name of the latest commit, and you will see the old and new versions of your file side by side (the split diff view; you can also choose a unified view, which is more like Word track changes).

Cloning a repository to your local computer

The GitHub website has a lot of great features, but really what you want is to edit your files as normal on your computer, and then update the repository on GitHub with your changes. To do this, you first need a local copy, or clone, of your repo.

GitHub Desktop is a GUI (graphical user interface) for the Git software, which was originally just used through the command line. You can do the same things in this programme, with the benefit of it being a little more familiar. When you download it, it installs Git on your computer, so you can also use Git without the GUI. Another GUI for Git (which I have very limited experience with) is GitKraken - you might find you prefer this. The screenshots are from the Windows version, but I’ve noted where the Mac version looks different.

(If you want to try both the GUI and command line approaches, you can clone your repo to another location the second time round and make another small change to your document to commit.)

Doing the same thing in a shell

  1. Open Git Bash/Terminal/sign into Galahad

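  2. If you haven’t used Git on this computer before, you will have to configure it, i.e. tell it the name and email address to record on your commits. GitHub uses the email address to link commits to your account, so it should match one registered there. (Your login details are asked for separately, the first time Git needs to talk to GitHub.) To do this, use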

git config --global user.name "Your Name Here"

git config --global user.email "your_email@well.ox.ac.uk"
  3. Navigate to where you want to clone your repository to (cd /MyPath)

  4. On GitHub, click “Clone or download” under your repository name on the main page of your repository. In the “Clone with HTTPS” section, click the clipboard icon to copy the clone URL. Use it in place of the URL in the command below, and your repo will be cloned to your local computer. Move into the new folder with cd, then use ls to see the README.md file that you made.
git clone https://github.com/jknightlab/GitHub_and_Rmd_Introduction.git
  5. To review the history of your repository, use git log. To look in detail at a particular commit, use git show [commit]
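
If you would like to rehearse the clone and history steps without touching a real repository, the sketch below may help. It is only an illustration under some assumptions: instead of GitHub it clones from a local bare repository (hub.git, a made-up name), so it runs offline, and the folder names seed and Git_Practice are placeholders. With a real repository you would use your clone URL instead.

```shell
# Demo identity, set through environment variables so your real Git
# configuration is left alone (placeholder values).
export GIT_AUTHOR_NAME="Your Name Here" GIT_AUTHOR_EMAIL="your_email@well.ox.ac.uk"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)             # scratch directory; delete it when you are done
cd "$demo"

# Build a tiny repository with two commits, then make a bare copy of it
# to play the role of GitHub (hypothetical stand-in).
git init --quiet seed
cd seed
echo "# Git_Practice" > README.md
git add README.md
git commit --quiet -m "Initial commit"
echo "This is my practice markdown document." >> README.md
git commit --quiet -am "Update README"
cd ..
git clone --quiet --bare seed hub.git

# The steps from the practical: clone, look at the files, review the history.
git clone --quiet hub.git Git_Practice
cd Git_Practice
ls                            # lists README.md
git log --oneline             # one line per commit, newest first
git show HEAD                 # author, date, message and diff of the latest commit
```

In git show, HEAD means “the commit I am currently on”; any commit ID printed by git log can be used in its place.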

If you carry on using the command line, you can set it up so that you don’t have to enter your user name and password every time.
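
One possible way to do this, sketched here as an assumption rather than the lab’s recommended setup, is Git’s built-in credential cache, which keeps what you typed in memory for a while. The example applies the setting to a scratch repository only; the one-hour timeout is an arbitrary choice. Note that the cache helper is aimed at Linux and macOS shells (Git for Windows normally comes with its own credential manager), and that GitHub now expects a personal access token rather than your account password over HTTPS.

```shell
repo=$(mktemp -d)             # scratch repository, so global settings stay untouched
cd "$repo"
git init --quiet .

# Remember HTTPS credentials in memory for 3600 seconds after you first enter them.
git config credential.helper "cache --timeout=3600"

# Check what is configured. Adding --global to both commands would apply
# the setting to every repository on this computer.
git config --get credential.helper
```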

Editing R Markdown files

  1. Open your README.md file in RStudio (the second half of this tutorial will go into R Markdown in more detail). The markdown should be displayed in the top left panel, which is your text editor.

  2. Make a small change to your README file, e.g.

An **extra** _line_ in the file with `a code block`
  3. Click the Preview button to see the updated output, as before.

Committing Changes

With GitHub Desktop

  1. This change has been saved in the local copy of our file. Your git software will have also noticed that the file has been changed. If you return to GitHub Desktop, you will see a dot has appeared on the Changes tab, indicating that you have some changes that have not been committed. Unsurprisingly, your README.md file has been edited, and you can see the diff view of this as on the GitHub website. You will also see that a file called README.html has appeared. RStudio created this when you previewed your markdown file, and this is really the output of a markdown document. If you look in the folder, you will find this file, and you can open it and look at it in your web browser.

  2. The html file is nice to look at, but I generally don’t want to synchronise it. I can regenerate it from the markdown file and GitHub will render it automatically. Therefore, we will only commit the changes to the markdown file. We could just untick it in the list before making the commit, but we actually want git not to pay attention to it from now on. Therefore, we are going to ignore it. Right click on the html file in the list of changes, and select “Ignore file”. It will be replaced in the list by a new change, called .gitignore. This is a hidden file that you have just created, which lists all the files that you don’t want git to track. This can be types of file as well as individual files - and don’t worry, you can edit this in the future.

  3. As on the website, fill in a commit summary and description at the bottom of the window - it is probably better practice to do the edit and the ignore as two separate commits as they are unrelated changes. I try to make each commit a distinct task, which could involve editing multiple different files but all with the same goal in mind. However, in this case, it doesn’t matter!

  4. Click “Commit to master”. The changes will vanish from the “Changes” tab, and the commit is added to the “History” tab.

  5. We have now changed our minds about the change we made and want to revert to the former version. It is not possible to undo a commit on GitHub (well, you can just edit the file again of course) but on your computer you can return to a previous version of your files. To do this, click on the commit with the minor edit of your README file, and click “Revert” on the top right.

Doing the same thing in a shell

  1. Use git status to see what files have been changed in your local repository. Use git diff to see the changes.

  2. The equivalent of ticking the files you want to add to a commit is git add [file name]. To ignore a file, create a file called “.gitignore” and add the file name to it (e.g. use vim)

vim .gitignore
Shift + i

*.html

ESC
:wq
  3. To make a commit, use git commit -m "descriptive message". All the changes that you have staged using git add will be in this commit.

  4. You have now committed these changes locally. Use git status again to check this.

  5. To undo this commit, use git reset [commit]. This undoes all commits after the commit named in the command, while leaving the edits themselves in your files.
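
The shell steps above can be strung together as follows. This is a minimal sketch in a scratch repository, assuming placeholder file contents and commit messages; it also writes the .gitignore entry with echo instead of vim, which is simply an alternative way of adding the same line.

```shell
# Demo identity for this example only (placeholder values).
export GIT_AUTHOR_NAME="Your Name Here" GIT_AUTHOR_EMAIL="your_email@well.ox.ac.uk"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)             # scratch repository
cd "$repo"
git init --quiet .
echo "This is my practice markdown document." > README.md
git add README.md
git commit --quiet -m "Add README"

# Simulate the edit made in RStudio and the preview file it leaves behind.
echo 'An **extra** _line_ in the file with `a code block`' >> README.md
echo "<html></html>" > README.html

git status --short            # README.md is modified, README.html is untracked
git diff                      # line-by-line view of the unstaged changes

# Ignore every .html file, and commit that as its own change.
echo "*.html" >> .gitignore
git add .gitignore
git commit --quiet -m "Ignore rendered HTML files"

# Stage and commit the README edit separately.
git add README.md
git commit --quiet -m "Add extra line to README"
git status --short            # prints nothing: everything is committed

# Undo the last commit; the edit itself stays in README.md.
git reset --quiet HEAD~1
git status --short            # README.md shows as modified again
```

HEAD~1 means “the commit before the current one”, so this git reset removes only the most recent commit from the history.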

Pushing changes to GitHub

Finally, we want to upload our local changes to GitHub. This is called pushing. With GitHub Desktop, this is done by pressing the “Sync” button. Your local repository is now synchronised with your online one. Navigate there to check this. If you or someone else then wants to work on a copy of these files on another computer, they can clone the repository as we did earlier.

With the command line, use git push. To download changes from GitHub, use git fetch.
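
To see what push and fetch each do, you could try something like the sketch below. It is an offline illustration with made-up names: hub.git is a local bare repository standing in for GitHub, and laptop and desktop are two clones standing in for two computers.

```shell
# Demo identity for this example only (placeholder values).
export GIT_AUTHOR_NAME="Your Name Here" GIT_AUTHOR_EMAIL="your_email@well.ox.ac.uk"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
cd "$demo"

# A starting repository, a bare copy playing GitHub, and two "computers".
git init --quiet seed
(cd seed && echo "# Git_Practice" > README.md \
  && git add README.md && git commit --quiet -m "Initial commit")
git clone --quiet --bare seed hub.git
git clone --quiet hub.git laptop
git clone --quiet hub.git desktop

# On the laptop: commit a change locally, then upload it.
cd laptop
echo "A line written on the laptop." >> README.md
git commit --quiet -am "Add a line from the laptop"
git push --quiet

# On the desktop: fetch downloads the new commit but leaves your files as they were.
cd ../desktop
git fetch --quiet
git status --short --branch   # reports that the local branch is behind by 1 commit
git log --oneline 'HEAD..@{u}'   # the commit that has been downloaded but not merged
```

Here @{u} is Git shorthand for the upstream branch, i.e. the copy of the branch that lives on the remote.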

Troubleshooting conflicts (Not super important right now)

There is a sort of opposite task to push called pull, which as you might imagine, downloads changes to your repository from GitHub. HOWEVER, this can occasionally cause problems, as this process also involves merging those changes with any changes you have made locally. Behind the scenes, pull is actually a combination of two separate tasks: fetch (downloading changes from GitHub to a remote-tracking branch such as origin/master) and merge (merging them into your local master branch).

I’m going into detail on this because the worst thing about GitHub Desktop is that it only has a button called “Sync”, which simultaneously pushes and pulls changes, rather than letting you push, pull or fetch separately. In some ways this is simpler. With what we have done so far, pulling will work just fine. It is also fine most of the time when you are the only person working on a file (which is mostly how I use GitHub - I have shared repositories, but we generally don’t edit the same file).

The only time I’ve had issues with this is when I’ve edited a file on my laptop, and uploaded the changes to GitHub. I’ve then edited the same file on my desktop without first pulling these changes. Then, when I try to upload those changes, Git understandably doesn’t know how to combine them. It will then tell me that I have a merge conflict. This essentially means that it puts both sets of changes into my file, surrounded by markers showing which sections conflict. I can then edit this file so that it has the changes I want, and once I’ve deleted the merge conflict markers, I can commit the changes as normal. Obviously, I can avoid this by remembering to pull down changes before I start working on the file. Git is also very good at coping with multiple changes to the same file when they are in different sections.

This sort of forgetfulness can also cause problems with the command line, but you have a bit more control because you can use pull and fetch separately. More importantly for Git use in general, though perhaps not for today, branches can be used to keep different sets of changes in parallel.

If you want to practice dealing with this, in case this happens to you by mistake:

  1. Open your local README file and delete a word.

  2. Go to the same file on GitHub, and delete a different word in the same line.

  3. Commit both changes.

  4. Press Sync in GitHub Desktop.

  5. You will get an error message, so click OK and go back to your file in RStudio to sort this out.

You will see something like this:


<<<<<<< HEAD
This is my practice document.
=======
This is my markdown document.
>>>>>>> origin/master
  6. Edit the conflicting section so that it holds the text you want to keep, and remove the marker lines that Git has added:
This is my practice markdown document.
  7. Save changes, and commit them (the file will have an exclamation mark next to it in your GitHub Desktop list of changes). The commit message will have been filled in for you. There will also be a little picture of a branch in your timeline to show that there were two versions of your file.
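
The same exercise can be rehearsed in a shell. The sketch below is an offline approximation with made-up names (hub.git stands in for GitHub, laptop and desktop for two computers), and it runs fetch and merge as two separate commands, the two halves of pull described earlier.

```shell
# Demo identity for this example only (placeholder values).
export GIT_AUTHOR_NAME="Your Name Here" GIT_AUTHOR_EMAIL="your_email@well.ox.ac.uk"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
cd "$demo"
git init --quiet seed
(cd seed && echo "This is my practice markdown document." > README.md \
  && git add README.md && git commit --quiet -m "Initial commit")
git clone --quiet --bare seed hub.git
git clone --quiet hub.git laptop
git clone --quiet hub.git desktop

# Laptop: delete one word and push.
cd laptop
echo "This is my markdown document." > README.md
git commit --quiet -am "Remove the word practice"
git push --quiet

# Desktop: delete a different word on the same line, without fetching first.
cd ../desktop
echo "This is my practice document." > README.md
git commit --quiet -am "Remove the word markdown"

# Fetch, then merge. The merge stops with a conflict and a non-zero exit
# status, hence the "|| true" so the script carries on.
branch=$(git rev-parse --abbrev-ref HEAD)   # master in this practical; newer Git may call it main
git fetch --quiet
git merge "origin/$branch" || true
cat README.md                 # both versions, wrapped in <<<<<<< ======= >>>>>>> markers

# Resolve: write the text you want to keep, then stage and commit.
echo "This is my practice markdown document." > README.md
git add README.md
git commit --quiet -m "Resolve merge conflict in README"
git log --oneline --graph     # the graph shows the two versions joining up
```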